-
Notifications
You must be signed in to change notification settings - Fork 491
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Iqss/7493 batch archiving api call #7494
Iqss/7493 batch archiving api call #7494
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall this looks fine but @qqmyers can you please review the comments I made and make sure make html
works?
@@ -866,9 +866,9 @@ For example: | |||
|
|||
``cp <your key file> /usr/local/payara5/glassfish/domains/domain1/files/googlecloudkey.json`` | |||
|
|||
.. _Archiving API Call: | |||
.. _Archiving API Calls: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At first I thought this anchor was being used and references to it should be change but apparently it isn't. It sort of makes me wonder if we should delete it if it isn't being used. No strong preference.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah - it seems random as to whether anchors exist or not. They can be useful beyond just internal links (e.g. you can use them to point people directly to that section via URL in email, etc., so having anchors might be a good default for any significant topics. That said, not a show stopper if it gets deleted.
} catch (EJBException e) { | ||
logger.log(Level.WARNING, "EJBException exception: {0}", e.getMessage()); | ||
return null; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are 9 spaces instead of 8 in a couple places above. Also, the Netbeans formatter prefers try {
instead of try{
.
@@ -987,7 +987,7 @@ public boolean doesChecksumExistInDatasetVersion(DatasetVersion datasetVersion, | |||
|
|||
|
|||
|
|||
public HashMap getFileMetadataHistory(DataFile df){ | |||
public HashMap<?, ?> getFileMetadataHistory(DataFile df){ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change to getFileMetadataHistory
seems to have nothing to do with this pull request and in fact I don't think the method is used at all. (I deleted it as was still able to compile the app.) I would suggest reverting this change and maybe adding a comment/question to the effect of "Is this method used?"
AbstractSubmitToArchiveCommand cmd = ArchiverUtil.createSubmitToArchiveCommand(className, dvRequestService.getDataverseRequest(), dsl.get(0)); | ||
|
||
if (cmd != null) { | ||
new Thread(new Runnable() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not personally very familiar with threads but there do seem to be a couple other places in the code (also added by Jim) that use this new Thread/new Runnable pattern and work fine. I think Jakarta EE offers various ways of handling threads and I know we use @Asynchronous
in certain places. I'm not suggesting a change but perhaps threads or async in general could be a topic for a future tech hours so we're all clear on the various options.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch! FWIW: I don't know details but I think this is basically an EJB vs plain Java difference. Using @Asyncrhonous starts a thread in a managed pool (and I think there's a ManagedExcutorService that would allow you to specify a separate pool that could be managed in the EJB config). Probably would be good to recommend Asynch and think about updating these at some point. That said, since this call just starts one thread to serially chunk through things so I don't think it should cause problems as is (and having others who know more weigh in before making a change might be useful).
Co-authored-by: Philip Durbin <[email protected]>
Co-authored-by: Philip Durbin <[email protected]>
Co-authored-by: Philip Durbin <[email protected]>
…om/TexasDigitalLibrary/dataverse.git into IQSS/7493-batch_archiving_API_call
What this PR does / why we need it: Provides a batch mechanism for archiving via the OAI-ORE/Bag mechanism.
Which issue(s) this PR closes:
Closes #7493
Special notes for your reviewer: PR adds a query to find unarchived versions (which have a null entry in the db for archivalCopyLocation) and then loops through the results to archive versions. The archiving of a single version is the same as if the existing single version API call was used.
Suggestions on how to test this: Configure an archiver - probably the file archiver, and call the API with various combinations of the three query params. The result should be a simple json list of versions (for listonly) or the sequential appearance of zipped bags for the correct versions in the specified location (e.g. up to the limit, only for the most recently published version of a dataset, etc.). To retry, setting the archivalCopyLocation for a given datasetversion in the db and removing any bag file from the final location will restore a version to the unarchived state. Note that the Bag contents aren't changed by this PR so there's no reason to inspect their contents. (If an issue such as a 0 size file is found, the same thing should happen if you call the archive a single version API and it would be a bug in the ORE/Bag generator code that should be a new issue.)
Does this PR introduce a user interface change? If mockups are available, please link/include them here: No
Is there a release notes update needed for this change?: Probably worth noting in release docs
Additional documentation: